Abstract

Importance sampling is a variance reduction technique for efficient
estimation of rare-event probabilities by Monte Carlo. For random
variables with heavy tails there is little consensus on how to choose
the change of measure used in importance sampling. In this paper we
study dynamic importance sampling schemes for sums of independent
and identically distributed random variables with regularly varying
tails. The number of summands can be random but must be independent
of the summands. For estimating the probability that the sum
exceeds a given threshold, we explicitly identify a class of dynamic
importance sampling algorithms with bounded relative errors. In fact,
these schemes are nearly asymptotically optimal in the sense that the
second moment of the corresponding importance sampling estimator
can be made as close as desired to the minimal possible value.


1  Introduction 

Suppose one wishes to estimate the quantity $p_b = P(S_n > b)$, where
$S_n = X_1 + \cdots + X_n$ and the $X_i$'s are real-valued, independent and
identically distributed (iid). A simple and often effective means is to use
Monte Carlo simulation. One generates $K$ iid replicas $\{S_n^i\}$ of the
random variable $S_n$, and forms the estimate
$Z_K = \frac{1}{K} \sum_{i=1}^{K} 1_{\{S_n^i > b\}}$. The rate of convergence
of $Z_K$ is determined by its variance:

$$\mathrm{var}(Z_K) = (p_b - p_b^2)/K.$$

Note that $p_b \to 0$ as $b \to \infty$ implies $\mathrm{var}(Z_K) \to 0$ as
$b \to \infty$. However, when estimating small probabilities a more important
statistic is the relative error of the estimate:

$$\mathrm{RE}(Z_K) = \frac{\text{standard deviation of } Z_K}{\text{mean of } Z_K}
= \frac{1}{\sqrt{K}} \sqrt{\frac{1 - p_b}{p_b}}.$$

Hence for bounded relative error it is necessary that $K$ grows as fast as
$1/p_b$, and because of this standard Monte Carlo simulation is rarely used to
estimate rare event probabilities.
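To make the growth of $K$ concrete, the following is a minimal sketch that evaluates the relative error formula above and the number of replicas needed for a target relative error. The function names `crude_mc_relative_error` and `replicas_needed` are illustrative and not part of the paper.

```python
import math


def crude_mc_relative_error(p_b: float, K: int) -> float:
    """Relative error sqrt((1 - p_b) / p_b) / sqrt(K) of the crude estimator Z_K."""
    return math.sqrt((1.0 - p_b) / p_b) / math.sqrt(K)


def replicas_needed(p_b: float, target_re: float) -> int:
    """Smallest K (up to rounding) whose relative error is at most target_re."""
    return math.ceil((1.0 - p_b) / (p_b * target_re ** 2))


if __name__ == "__main__":
    # The required K scales like 1 / p_b for a fixed target relative error.
    for p_b in (4e-3, 4e-6, 4e-9):
        print(p_b, replicas_needed(p_b, 0.01))
```

For a fixed target, dividing $p_b$ by 1000 multiplies the required number of replicas by roughly 1000, which is the scaling described in the paragraph above.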

An alternative approach to the problem of estimating small probabilities
is importance sampling, where instead of sampling from the original
distribution samples are drawn from a new distribution under which the
rare events are no longer rare. More specifically, iid samples of the
random variable $\bar S_n$ are drawn, where
$\bar S_n = \bar X_1 + \cdots + \bar X_n$ and the vector
$(\bar X_1, \ldots, \bar X_n)$ has an alternative distribution, say $\nu^b$.
The corresponding importance sampling estimator is just the sample average
of iid copies of

$$\hat p_b = 1_{\{\bar S_n > b\}} \frac{d\mu}{d\nu^b}(\bar X_1, \ldots, \bar X_n),$$

where $\mu$ denotes the distribution of $(X_1, \ldots, X_n)$. Clearly this
estimator is unbiased. The goal of importance sampling is to choose $\nu^b$ so
as to minimize the variance, or equivalently, the second moment of $\hat p_b$,

$$E[\hat p_b^2] = E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right].$$

It turns out that solving the unconstrained minimization problem
over all possible distributions requires knowing $p_b$. Instead, one typically
searches within a parametric family of changes of measure and looks for a
distribution that satisfies an optimality criterion. Jensen's inequality implies

$$E[\hat p_b^2] \ge (E[\hat p_b])^2 = p_b^2,$$

thus giving a lower bound on the second moment. A change of measure
is said to be asymptotically optimal, or have asymptotically optimal relative
error, if

$$\lim_{b \to \infty} \frac{E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right]}{p_b^2} = 1. \tag{1.1}$$

One would like to construct schemes whose asymptotic relative error is close
to or equal to this minimal value 1.

In [6, 7] it was shown that ideas from stochastic control and game theory
can be used effectively in the design of importance sampling schemes for
random variables with finite moment generating functions. This paper is
concerned with sums of non-negative random variables with heavy tailed
distributions (by which we mean $E[\exp(tX_1)] = \infty$ for all $t > 0$). For this
setup, there was no general theory for choosing sampling distributions
that satisfy this asymptotic optimality criterion, or even distributions that
have uniformly (in $b$) bounded relative error. A goal of the current paper
is to demonstrate that the techniques of control theory can again serve as
basic tools in the design and analysis of asymptotically optimal importance
sampling schemes for heavy tailed distributions.

The paper is organized as follows. Section 2 introduces a parametric
family of alternative sampling distributions (i.e., controls) $\nu^b$. In Section
3, we use weak convergence arguments to show that, when the number of
summands is fixed, such changes of measure induce estimators with bounded
relative errors. Moreover, one can always identify nearly asymptotically
optimal schemes in the sense that the corresponding importance sampling
estimators come within an (arbitrarily) prescribed error of the absolute lower
bound 1 in (1.1). In Section 4 we adapt this construction to estimate

$$p_b = P(X_1 + \cdots + X_N > b)$$

when $N$ is a random variable that is independent of $\{X_i, i \in \mathbb{N}\}$ and
satisfies $E[z^N] < \infty$ for some $z > 1$. For this case we are also able to identify
importance sampling schemes that are nearly asymptotically optimal.
Section 5 presents a collection of numerical results. We compare our scheme
with two existing simulation methods, one of which is based on conditional
Monte Carlo rather than importance sampling [1], and the other is based on
delayed hazard rate twisting [9]. It is worth mentioning that the conditional
Monte Carlo algorithm produces estimates that have bounded relative errors,
although it is not known whether they satisfy the asymptotic optimality
criterion.

2  Problem  setup 

Consider a sequence of iid non-negative random variables $\{X_i, i \in \mathbb{N}\}$ with
tail probability $\bar F(x) = P(X_i > x)$. Let $S_n = X_1 + \cdots + X_n$. Assume that,
for some $\alpha > 0$, the function $\bar F$ satisfies

$$\lim_{b \to \infty} \frac{\bar F(ab)}{\bar F(b)} = a^{-\alpha} \quad \text{for all } a > 0. \tag{2.1}$$

A random variable with this property is said to have regularly varying tails.
It is well known that such random variables are subexponential [1, page 253,
Proposition 1.4] in the sense that

$$\lim_{b \to \infty} \frac{P(S_n > b)}{P(X_1 > b)} = n \tag{2.2}$$

for every $n \in \mathbb{N}$. An in-depth account of heavy-tailed distributions can be
found in [8].
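As a quick numerical illustration of (2.1), the sketch below evaluates the ratio $\bar F(ab)/\bar F(b)$ for the Pareto-type tail $\bar F(x) = (1 + x)^{-\alpha}$, which is the tail used for Tables 1 and 2 in Section 5. The helper names `tail` and `tail_ratio` are illustrative only.

```python
def tail(x: float, alpha: float) -> float:
    """Pareto-type tail P(X > x) = (1 + x)^(-alpha) for x >= 0."""
    return (1.0 + x) ** (-alpha)


def tail_ratio(a: float, b: float, alpha: float) -> float:
    """The ratio tail(a * b) / tail(b) that appears in (2.1)."""
    return tail(a * b, alpha) / tail(b, alpha)


if __name__ == "__main__":
    alpha, a = 0.5, 0.5
    for b in (1e2, 1e6, 1e12):
        # The ratio approaches a^(-alpha) as b grows.
        print(b, tail_ratio(a, b, alpha), a ** (-alpha))
```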

We wish to estimate $P(S_N > b)$ when $b$ is a large positive number and
$N$ is an $\mathbb{N}$-valued random variable independent of $\{X_i\}$. In preparation, we
first study the special case where $N = n$ is a fixed number. As discussed in
the Introduction, the samples are drawn from an alternative distribution $\nu^b$.
Our goal is to find, for each $\varepsilon > 0$, measures $\nu^b$ (we omit the
$\varepsilon$-dependence in the notation) such that

$$\lim_{b \to \infty} \frac{E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right]}{P(S_n > b)^2} \le 1 + \varepsilon. \tag{2.3}$$

When $\varepsilon$ is small, the importance sampling scheme based on $\nu^b$ achieves a
nearly asymptotically optimal relative error [compare with (1.1)]. Using the
subexponential property (2.2), (2.3) reduces to

$$\lim_{b \to \infty} \frac{E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right]}{\bar F(b)^2} \le (1 + \varepsilon) n^2. \tag{2.4}$$

The algorithm for this special case can then be adapted to the case where
$N$ is random. This extension will be discussed in Section 4.


Remark 2.1 We will assume throughout that the random variable $X_i$ has
a density $f$. This condition is not essential and is imposed simply for
convenience of exposition.


2.1  A  parameterized  family  of  sampling  distributions 

In the setting of light-tailed random variables (i.e., those with finite moment
generating functions in a neighborhood of the origin), it is customary to
consider sampling distributions that belong to the class of "exponential tilts"
and/or their mixtures, and indeed one can obtain very good results by doing
so.

However, the situation is less clear for random variables with regularly
varying tails. A contribution of the present paper is the identification
of a class of sampling distributions that can yield asymptotically optimal
performance and are simple to implement. The main requirement is
that one should be able to sample from the tail distribution with density
$f(y) 1_{\{y > c\}} / \bar F(c)$ for all $c > 0$.

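Fix $n \in \mathbb{N}$. Each distribution in our class will be determined by
parameters $(a, p_{i,n}, q_{i,n})$, where $a \in (0, 1)$ and
$\{p_{i,n}, q_{i,n} : 1 \le i \le n - 1\}$ is a sequence
of non-negative numbers such that $p_{i,n} + q_{i,n} = 1$ and $q_{i,n} > 0$ for every $i$. It
is easiest to describe the distribution of interest as that induced by random
variables $(Y_{1,n}, Y_{2,n}, \ldots, Y_{n,n})$. Here $Y_{1,n}$ has the density

$$f_{1,n}(y) = p_{1,n} f(y) + q_{1,n} \frac{f(y)}{\bar F(ab)} 1_{\{y > ab\}}.$$

For $1 < i < n$ the conditional density of $Y_{i,n}$, given
$S_{i-1,n} = Y_{1,n} + \cdots + Y_{i-1,n} = s_{i-1} \le b$, is

$$f_{i,n}(y) = p_{i,n} f(y) + q_{i,n} \frac{f(y)}{\bar F(a(b - s_{i-1}))} 1_{\{y > a(b - s_{i-1})\}},$$

and $f_{i,n} = f$ if $s_{i-1} > b$. Lastly, if $s_{n-1} \le b$ then $Y_{n,n}$ has conditional density

$$f_{n,n}(y) = \frac{f(y)}{\bar F(b - s_{n-1})} 1_{\{y > b - s_{n-1}\}},$$

and otherwise the conditional density of $Y_{n,n}$ is $f$.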

Note that it is not difficult to simulate from this distribution. When
drawing the sample $Y_{i,n}$ with $i < n$, if $S_{i-1,n} = s_{i-1} \le b$ then one first
flips a coin that is heads with probability $p_{i,n}$. If heads comes up then we
sample from the original distribution. Otherwise we sample from the original
distribution conditioned on the event that the outcome is greater than
$a(b - s_{i-1})$. For the last sample $Y_{n,n}$ no coin is flipped: if $s_{n-1} \le b$ we
sample from the original distribution conditioned on exceeding $b - s_{n-1}$. If
$s_{i-1} > b$ then of course we sample from the original distribution.
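The simulation recipe above can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it assumes the Pareto-type tail $\bar F(x) = (1 + x)^{-\alpha}$ (so that both the original and the conditioned tail distributions can be sampled by inversion), and the names `sample_dis`, `sample_original` and `sample_tail` are hypothetical. The function returns the sampled sum together with the likelihood ratio $d\mu/d\nu^b$ evaluated along the sampled path.

```python
import random


def tail(x, alpha):
    """Assumed tail P(X > x) = (1 + x)^(-alpha), x >= 0."""
    return (1.0 + x) ** (-alpha)


def sample_original(alpha, rng):
    """Draw X from the original distribution by inversion."""
    u = 1.0 - rng.random()                  # u lies in (0, 1]
    return u ** (-1.0 / alpha) - 1.0


def sample_tail(c, alpha, rng):
    """Draw X conditioned on X > c, i.e. from f(y) 1{y > c} / tail(c)."""
    u = 1.0 - rng.random()
    return (1.0 + c) * u ** (-1.0 / alpha) - 1.0


def sample_dis(n, b, a, p, alpha, rng):
    """Return (S_n, dmu/dnu) for one path; p[i] plays the role of p_{i+1,n}."""
    s, weight = 0.0, 1.0
    for i in range(n):
        if s > b:                           # threshold crossed: original law, ratio 1
            s += sample_original(alpha, rng)
            continue
        if i == n - 1:                      # last summand: forced over b - s
            c = b - s
            y = sample_tail(c, alpha, rng)
            weight *= tail(c, alpha)        # f / (f / tail(c)) on {y > c}
        else:
            c = a * (b - s)
            p_i, q_i = p[i], 1.0 - p[i]
            if rng.random() < p_i:          # heads: original distribution
                y = sample_original(alpha, rng)
            else:                           # tails: jump beyond a * (b - s)
                y = sample_tail(c, alpha, rng)
            if y > c:                       # both mixture components contribute
                weight *= tail(c, alpha) / (p_i * tail(c, alpha) + q_i)
            else:                           # only the original component contributes
                weight *= 1.0 / p_i
        s += y
    return s, weight


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_dis(3, 1e6, 0.9, [0.5, 0.5], 0.5, rng) for _ in range(10000)]
    # Every path ends above b, so the estimator is the average likelihood ratio.
    print(sum(w for _, w in draws) / len(draws))
```

Because the last summand is forced over the remaining distance, every sampled path ends above $b$ and the indicator in the estimator equals one; all of the variability sits in the likelihood ratio.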


Remark 2.2 If $s_{i-1} \le b$, then $(b - s_{i-1})$ is the residual distance to go
before the sample sum exceeds the threshold $b$. The role of the parameter
$a \in (0, 1)$ is to determine how close we will come to jumping all the required
distance when the coin turns up tails (except for $i = n$). Since $a < 1$ we
do not attempt to jump over the threshold w.p.1, but rather with positive
probability we come close to but not over the threshold. It will turn out that
the asymptotic performance (as $b \uparrow \infty$) depends on $a$, and that as $a \uparrow 1$ this
asymptotic performance approaches optimality. Hence it is tempting to use
$a = 1$ in the prelimit also. However, it turns out that the limits $a \uparrow 1$ and
$b \uparrow \infty$ do not commute. As a consequence, the corresponding importance
sampling scheme does not even achieve good asymptotic performance if one
sets $a = 1$ in the prelimit.

3  Near  asymptotic  optimality  for  fixed  n 

In this section we analyze, via weak convergence methods, the asymptotic
performance of the parametric family of changes of measure defined in
Section 2. For each fixed choice of the parameters $(a, p_{i,n}, q_{i,n})$ [i.e., controls],
we obtain a cost. Thus finding a good change of measure amounts to solving
a deterministic, discrete time control problem. Nearly optimal controls
are identified, which in turn yield asymptotically nearly optimal changes of
measure for the importance sampling problem.

3.1  A  weak  convergence  analysis 

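Proposition 3.1 Fix $n \in \mathbb{N}$. Let $\nu^b$ be the importance sampling
distributions defined in Section 2 with parameters $(a, p_{i,n}, q_{i,n})$. Then

$$\lim_{b \to \infty} \frac{E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right]}{\bar F(b)^2}
= \prod_{j=1}^{n-1} \frac{1}{p_{j,n}} + a^{-\alpha} \sum_{i=1}^{n-1} \frac{1}{q_{i,n}} \prod_{j=1}^{i-1} \frac{1}{p_{j,n}}.$$

The limit will be shown using weak convergence methods. After setting
up the notation, we present a few preliminary results before returning to
the proof of the proposition. To begin, we rewrite the expected value
(normalized by $\bar F(b)^2$) as

$$\frac{1}{\bar F(b)^2} \int_{\{y_1 + \cdots + y_n > b\}} \frac{d\mu}{d\nu^b}(y) \, \mu(dy),$$

where $y = (y_1, \ldots, y_n)$. Define

$$K = \{y \in \mathbb{R}_+^n : y_1 + \cdots + y_n > 1\}$$

and a family of measures on $\mathbb{R}_+^n$ by

$$\theta^b(A) = \frac{\mu(b(A \cap K))}{\bar F(b)}$$

(recall that $\mu$ is the product probability measure induced by the iid random
variables $X_1, \ldots, X_n$). Then the integral can be rewritten in the form

$$\int_K \frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \, \theta^b(dy).$$

For the rest of the proof we use the definitions $M_n = \max(X_1, \ldots, X_n)$ and
$S_n = X_1 + \cdots + X_n$.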


Lemma 3.2 If $(X_1, \ldots, X_n)$ are iid non-negative random variables with the
subexponential property, then

$$\lim_{b \to \infty} P(M_n \le b \mid S_n > b) = 0.$$

Proof. Observe the following result due to the inclusion-exclusion principle:

$$P(M_n > b) = n P(X_1 > b) + C_2 (P(X_1 > b))^2 + \cdots + C_n (P(X_1 > b))^n,$$

where $C_2, \ldots, C_n$ are some constants. It follows that

$$\lim_{b \to \infty} \frac{P(M_n > b)}{P(X_1 > b)} = n.$$

Thanks to the subexponential property (2.2), and since $\{M_n > b\} \subset \{S_n > b\}$
for non-negative summands,

$$P(M_n \le b \mid S_n > b) = 1 - P(M_n > b \mid S_n > b) = 1 - \frac{P(M_n > b)}{P(S_n > b)} \to 0.$$

This completes the proof. ■
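The first limit in the proof can be checked numerically, since for iid summands $P(M_n > b) = 1 - (1 - \bar F(b))^n$. The sketch below assumes the Pareto-type tail $\bar F(x) = (1 + x)^{-\alpha}$; the names `prob_max_exceeds` and `max_ratio` are illustrative.

```python
import math


def tail(x, alpha):
    """Assumed tail P(X > x) = (1 + x)^(-alpha), x >= 0."""
    return (1.0 + x) ** (-alpha)


def prob_max_exceeds(n, b, alpha):
    """P(M_n > b) = 1 - (1 - tail(b))^n, evaluated in a numerically stable way."""
    return -math.expm1(n * math.log1p(-tail(b, alpha)))


def max_ratio(n, b, alpha):
    """The ratio P(M_n > b) / P(X_1 > b), which tends to n as b grows."""
    return prob_max_exceeds(n, b, alpha) / tail(b, alpha)


if __name__ == "__main__":
    for b in (1e1, 1e6, 1e12):
        print(b, max_ratio(5, b, 0.5))
```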


We can now analyze the weak convergence of $\theta^b$ as $b \to \infty$. Although the
$\theta^b$'s are not necessarily probability measures, there is an obvious extension
of the notion of weak convergence to non-negative measures with uniformly
bounded mass [5, page 373]. In the following $\theta_j$ is defined as the probability
measure on $\mathbb{R}_+^n$ generated by the random vector $(Y_1^j, \ldots, Y_n^j)$, where $Y_i^j = 0$
for $i \ne j$ and $Y_j^j$ has density $\alpha x^{-\alpha - 1} 1_{\{x \ge 1\}}$.

Lemma 3.3 $\theta^b \Rightarrow \theta \doteq \sum_{j=1}^{n} \theta_j$.

Proof. For any vector $a \in \mathbb{R}_+^n \setminus \{0\}$ define the rectangle
$R_a = \{y \in \mathbb{R}_+^n : y_1 \le a_1, \ldots, y_n \le a_n\}$. Since these rectangles are
convergence determining [3, Example 2.3] and
$\lim_{b \to \infty} \theta^b(\mathbb{R}_+^n) = n = \theta(\mathbb{R}_+^n)$ thanks to (2.2), it suffices
to show that

$$\lim_{b \to \infty} \theta^b(R_a) = \theta(R_a) \tag{3.1}$$

for all those $a \in \mathbb{R}_+^n$ such that $\theta(\partial R_a) = 0$.

To this end we first consider the case $\max\{a_1, \ldots, a_n\} \le 1$. It is
obvious that $\theta(R_a) = 0$, so we only have to prove $\theta^b(R_a) \to 0$. This follows
immediately from Lemma 3.2 and the subexponential property (2.2), since

$$\theta^b(R_a) \le \frac{P(M_n \le b, S_n > b)}{\bar F(b)} = P(M_n \le b \mid S_n > b) \frac{P(S_n > b)}{\bar F(b)} \to 0.$$

Next consider the case $\max\{a_1, \ldots, a_n\} > 1$, and without loss of
generality assume that $a_j > 1$ for $1 \le j \le k$ only. We can also assume that $a_i > 0$
for every $i$ since $\theta(\partial R_a) > 0$ otherwise. Define

$$U_0 = \{y_1 \le 1, \ldots, y_k \le 1, y_{k+1} \le a_{k+1}, \ldots, y_n \le a_n\},$$

and for $1 \le j \le k$

$$U_j = \{y_1 \le 1, \ldots, y_{j-1} \le 1, 1 < y_j \le a_j, y_{j+1} \le a_{j+1}, \ldots, y_n \le a_n\}.$$

Clearly the $U_j$'s are disjoint and $R_a = U_0 \cup U_1 \cup \cdots \cup U_k$. All we need to
show is that $\theta^b(U_j) \to \theta(U_j)$ for every $0 \le j \le k$.

The convergence of $\theta^b(U_0) \to \theta(U_0) = 0$ is already established since
$U_0 = R_{\bar a}$ where $\bar a = (1, \ldots, 1, a_{k+1}, \ldots, a_n)$ and
$\max\{\bar a_1, \ldots, \bar a_n\} \le 1$. It remains to show the convergence for the case
where $j \ge 1$. Using the definition of $\theta$ and the fact that $\theta_j$ is supported on
points where $y_i = 0$ if $i \ne j$, we see that

$$\theta(U_j) = \theta_j(U_j) = \int_1^{a_j} \alpha x^{-\alpha - 1} \, dx = 1 - a_j^{-\alpha}.$$

Since $U_j \subset K$, it follows from the definition of $\theta^b$ that

$$\theta^b(U_j) = \frac{1}{\bar F(b)} P\{(X_1, \ldots, X_n) \in b U_j\}
= \frac{1}{\bar F(b)} P(b < X_j \le a_j b) \cdot \prod_{i < j} P(X_i \le b) \cdot \prod_{i > j} P(X_i \le a_i b)$$

$$= \left( 1 - \frac{\bar F(a_j b)}{\bar F(b)} \right) \prod_{i < j} (1 - \bar F(b)) \prod_{i > j} (1 - \bar F(a_i b)).$$

Since $a_i > 0$ for every $i$, the regularly varying tail property implies

$$\lim_{b \to \infty} \theta^b(U_j) = 1 - a_j^{-\alpha} = \theta(U_j).$$

This completes the proof. ■


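Lemma 3.4 There exists $M < \infty$ such that for any $b \in [0, \infty)$ and any
$y \in K$,

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \le M.$$

Proof. For $y \in K$ set $s_0 = 0$, $s_j = y_1 + \cdots + y_j$, and define
$\tau(y) = \min\{j \ge 1 : s_j > 1\}$. We consider the cases $\tau(y) = n$ and
$\tau(y) < n$ separately.

Case 1: Assume for now that $y \in K$ and $\tau(y) = n$. Then by definition of
$\nu^b$,

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) = \frac{\bar F(b(1 - s_{n-1}))}{\bar F(b)}
\prod_{j=1}^{n-1} \left[ \frac{1}{p_{j,n}} 1_{\{y_j \le a(1 - s_{j-1})\}}
+ \frac{\bar F(ab(1 - s_{j-1}))}{p_{j,n} \bar F(ab(1 - s_{j-1})) + q_{j,n}} 1_{\{y_j > a(1 - s_{j-1})\}} \right].$$

Consider the decomposition $K = K_1 \cup K_2$ where
$K_1 = \{y \in \mathbb{R}_+^n : y_j \le a(1 - s_{j-1}) \text{ for } 1 \le j \le n - 1 \text{ and } s_n > 1\}$
and $K_2 = K \setminus K_1$.

For $y \in K_1$, it is not difficult to argue by induction that $s_j \le 1 - (1 - a)^j$
for $1 \le j \le n - 1$. Therefore,

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) = \frac{\bar F(b(1 - s_{n-1}))}{\bar F(b)} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}}
\le \frac{\bar F(b(1 - a)^{n-1})}{\bar F(b)} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}}. \tag{3.2}$$

For any $y \in K_2$, let $J = \{j : y_j > a(1 - s_{j-1}), j = 1, \ldots, n - 1\}$, which is
non-empty. Define $j^*$ to be the smallest element in $J$, and let

$$q = \min\{q_{j,n} : 1 \le j \le n - 1\}. \tag{3.3}$$

Note that for all $j$

$$\frac{\bar F(ab(1 - s_{j-1}))}{p_{j,n} \bar F(ab(1 - s_{j-1})) + q_{j,n}} \le \frac{1}{p_{j,n}}.$$

Then the following bound is obtained:

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by)
= \frac{\bar F(b(1 - s_{n-1}))}{\bar F(b)} \prod_{j \in J} \frac{\bar F(ab(1 - s_{j-1}))}{p_{j,n} \bar F(ab(1 - s_{j-1})) + q_{j,n}} \prod_{j \notin J} \frac{1}{p_{j,n}}$$

$$\le \frac{1}{\bar F(b)} \frac{\bar F(ab(1 - s_{j^*-1}))}{p_{j^*,n} \bar F(ab(1 - s_{j^*-1})) + q_{j^*,n}} \prod_{j \ne j^*} \frac{1}{p_{j,n}}$$

$$\le \frac{1}{q} \frac{\bar F(ab(1 - s_{j^*-1}))}{\bar F(b)} \prod_{j \ne j^*} \frac{1}{p_{j,n}}.$$

Since for every $k \in \{1, \ldots, j^* - 1\}$ we have $y_k \le a(1 - s_{k-1})$, induction yields
that $s_k \le 1 - (1 - a)^k$ for all such $k$'s. In particular,
$s_{j^*-1} \le 1 - (1 - a)^{j^*-1} \le 1 - (1 - a)^{n-2}$. Thus for every $y \in K_2$

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \le \frac{1}{q} \frac{\bar F(ab(1 - a)^{n-2})}{\bar F(b)} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}}. \tag{3.4}$$

Thanks to (3.2), (3.4), observing $q \le 1$ and
$\min\{(1 - a)^{n-1}, a(1 - a)^{n-2}\} \ge a(1 - a)^{n-1}$, we obtain the bound

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \le \frac{1}{q} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}} \cdot \frac{\bar F(ba(1 - a)^{n-1})}{\bar F(b)} \tag{3.5}$$

for every $y \in K$ with $\tau(y) = n$.

Case 2: Assume that $y \in K$ and $\tau = \tau(y) < n$. In this case we have

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by)
= \frac{1}{\bar F(b)} \frac{\bar F(ab(1 - s_{\tau-1}))}{p_{\tau,n} \bar F(ab(1 - s_{\tau-1})) + q_{\tau,n}}
\prod_{j=1}^{\tau-1} \left[ \frac{1}{p_{j,n}} 1_{\{y_j \le a(1 - s_{j-1})\}}
+ \frac{\bar F(ab(1 - s_{j-1}))}{p_{j,n} \bar F(ab(1 - s_{j-1})) + q_{j,n}} 1_{\{y_j > a(1 - s_{j-1})\}} \right]$$

$$\le \frac{1}{q_{\tau,n}} \frac{\bar F(ab(1 - s_{\tau-1}))}{\bar F(b)}
\prod_{j=1}^{\tau-1} \left[ \frac{1}{p_{j,n}} 1_{\{y_j \le a(1 - s_{j-1})\}}
+ \frac{\bar F(ab(1 - s_{j-1}))}{p_{j,n} \bar F(ab(1 - s_{j-1})) + q_{j,n}} 1_{\{y_j > a(1 - s_{j-1})\}} \right].$$

Using the same argument as in Case 1 (replace $n$ by $\tau$), we obtain the bound

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \le \frac{1}{q_{\tau,n}} \cdot \frac{1}{q} \prod_{j=1}^{\tau-1} \frac{1}{p_{j,n}} \cdot \frac{\bar F(ba(1 - a)^{\tau-1})}{\bar F(b)} \tag{3.6}$$

for $y \in K$ and $\tau = \tau(y) < n$.

To summarize, since $q \le q_{\tau,n}$ and $p_{j,n} \le 1$, (3.5) and (3.6) imply that

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \le \frac{1}{q^2} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}} \cdot \frac{\bar F(ba(1 - a)^{n-1})}{\bar F(b)} \tag{3.7}$$

for every $y \in K$. Because $\bar F$ is assumed to have regularly varying tails, the
right-hand-side of (3.7) is bounded from above by a constant independent
of $b$. ■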


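Proof of Proposition 3.1. For each $1 \le i \le n$ define the set
$A_i = \{y \in \mathbb{R}_+^n : y_k \le a(1 - s_{k-1}), 1 \le k \le i - 1, s_i > 1\}$. Note that for
$1 \le i \le n - 1$ and for $y \in A_i$, as $b$ tends to infinity,

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by)
= \frac{1}{\bar F(b)} \frac{\bar F(ab(1 - s_{i-1}))}{p_{i,n} \bar F(ab(1 - s_{i-1})) + q_{i,n}} \prod_{j=1}^{i-1} \frac{1}{p_{j,n}}
\to \frac{1}{q_{i,n}} a^{-\alpha} (1 - s_{i-1})^{-\alpha} \prod_{j=1}^{i-1} \frac{1}{p_{j,n}}. \tag{3.8}$$

Since $1 - s_{i-1} \ge (1 - a)^{i-1}$ in $A_i$ the convergence is uniform for all $y$ in $A_i$.
This follows from a well known theorem, which states that if $\bar F$ is of regular
variation then the convergence

$$\lim_{x \to \infty} \frac{\bar F(ax)}{\bar F(x)} = a^{-\alpha}$$

is uniform for $a$ in any compact subset of $(0, 1]$ [4, page 22, Theorem 1.5.2].
Similarly note that on $A_n$ we have the following uniform convergence:

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) = \frac{\bar F(b(1 - s_{n-1}))}{\bar F(b)} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}}
\to (1 - s_{n-1})^{-\alpha} \prod_{j=1}^{n-1} \frac{1}{p_{j,n}}. \tag{3.9}$$

Let $A = \cup_{i=1}^{n} A_i$ and let $g$ be a bounded continuous function on $\mathbb{R}_+^n$ that
satisfies

$$g(y) = \lim_{b \to \infty} \frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by), \quad \text{for every } y \in A.$$

Such $g$ always exists since the closures of the $A_i$ are disjoint.

Thanks to the uniform convergence and the fact that $\theta^b$ has bounded mass, we
have

$$\lim_{b \to \infty} \int_A \frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \, \theta^b(dy) = \lim_{b \to \infty} \int_A g(y) \, \theta^b(dy).$$

Observing that $\theta(\partial A) = 0$, the weak convergence $\theta^b \Rightarrow \theta$ implies that

$$\lim_{b \to \infty} \int_A g(y) \, \theta^b(dy) = \int_A g(y) \, \theta(dy),$$

as well as

$$\theta^b(A^c) \to \theta(A^c) = 0.$$

The last display, Lemma 3.4 and the fact that $\mathrm{supp}(\theta^b) \subset K$, in turn yield

$$\lim_{b \to \infty} \int_{A^c} \frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \, \theta^b(dy) = 0 = \int_{A^c} g(y) \, \theta(dy).$$

It follows that

$$\lim_{b \to \infty} \int_{\mathbb{R}_+^n} \frac{1}{\bar F(b)} \frac{d\mu}{d\nu^b}(by) \, \theta^b(dy) = \int_{\mathbb{R}_+^n} g(y) \, \theta(dy).$$

Since the support of $\theta$ is those $y = (y_1, \ldots, y_n)$ where $y_j \ge 1$ for a single $j$
and $y_i = 0$ for $i \ne j$, it is not difficult to check that, thanks to (3.8) and
(3.9),

$$\int_{\mathbb{R}_+^n} g(y) \, \theta(dy) = \prod_{j=1}^{n-1} \frac{1}{p_{j,n}} + a^{-\alpha} \sum_{i=1}^{n-1} \frac{1}{q_{i,n}} \prod_{j=1}^{i-1} \frac{1}{p_{j,n}}.$$

This completes the proof. ■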


3.2  Solution  to  the  limit  problem 

In this section we argue that one can choose $(a, p_{i,n}, q_{i,n})$ appropriately so
that the corresponding change of measure $\nu^b$ attains nearly asymptotically
optimal relative error; see (2.4). We need the following result.


Lemma 3.5 Given parameters $(a, p_{i,n}, q_{i,n})$, define

$$J(a; p_{i,n}, q_{i,n}) \doteq \prod_{j=1}^{n-1} \frac{1}{p_{j,n}} + a^{-\alpha} \sum_{i=1}^{n-1} \frac{1}{q_{i,n}} \prod_{j=1}^{i-1} \frac{1}{p_{j,n}}.$$

Then for any fixed $a \in (0, 1)$, the function $J(a; \cdot, \cdot)$ is minimized at

$$p^*_{k,n} = \frac{(n - k - 1) a^{-\alpha/2} + 1}{(n - k) a^{-\alpha/2} + 1}, \qquad q^*_{k,n} = 1 - p^*_{k,n},$$

with minimum

$$J^*(a) = ((n - 1) a^{-\alpha/2} + 1)^2.$$

Proof. We use an argument of dynamic programming type. For $1 \le k \le n$,
define

$$J_k(a; p_{i,n}, q_{i,n}) \doteq \prod_{j=k}^{n-1} \frac{1}{p_{j,n}} + a^{-\alpha} \sum_{i=k}^{n-1} \frac{1}{q_{i,n}} \prod_{j=k}^{i-1} \frac{1}{p_{j,n}}$$

and

$$V_k(a) = \inf_{\{p_{i,n}, q_{i,n}\}} J_k(a; p_{i,n}, q_{i,n}).$$

Note that $J_k$ is independent of those $(p_{i,n}, q_{i,n})$ where $i \le k - 1$ and that
the original problem corresponds to $k = 1$ (i.e., $J = J_1$). It is not difficult to
check by definition that

$$J_k(a; p_{i,n}, q_{i,n}) = a^{-\alpha} \frac{1}{q_{k,n}} + \frac{1}{p_{k,n}} J_{k+1}(a; p_{i,n}, q_{i,n}),$$

which in turn yields the dynamic programming equation (DPE)

$$V_k(a) = \inf \left\{ a^{-\alpha} \frac{1}{q_{k,n}} + \frac{1}{p_{k,n}} V_{k+1}(a) : p_{k,n} \ge 0, q_{k,n} > 0, p_{k,n} + q_{k,n} = 1 \right\}.$$

Since $V_n(a) = 1$ by definition, one can easily use backward induction (we
omit the details) to show that

$$V_k(a) = ((n - k) a^{-\alpha/2} + 1)^2,$$

and that the right-hand-side of the DPE is minimized at $(p^*_{k,n}, q^*_{k,n})$. This
completes the proof. ■
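The closed-form weights of Lemma 3.5 can be checked numerically. The following sketch evaluates the limit cost $J$ for given weights and compares it with $J^*(a)$; the function names `optimal_weights`, `limit_cost` and `j_star` are illustrative, and the list index `k - 1` holds the weight for step $k$.

```python
def optimal_weights(n, a, alpha):
    """Weights (p*_{k,n}, q*_{k,n}) for k = 1, ..., n - 1 as stated in Lemma 3.5."""
    c = a ** (-alpha / 2.0)
    p = [((n - k - 1) * c + 1.0) / ((n - k) * c + 1.0) for k in range(1, n)]
    q = [1.0 - p_k for p_k in p]
    return p, q


def limit_cost(a, alpha, p, q):
    """J(a; p, q) = prod_j 1/p_j + a^(-alpha) * sum_i (1/q_i) * prod_{j<i} 1/p_j."""
    prefix = 1.0        # running product of 1/p_j over j < i
    total = 0.0
    for p_i, q_i in zip(p, q):
        total += prefix / q_i
        prefix /= p_i
    return prefix + a ** (-alpha) * total


def j_star(n, a, alpha):
    """Minimal cost ((n - 1) a^(-alpha/2) + 1)^2 from Lemma 3.5."""
    return ((n - 1) * a ** (-alpha / 2.0) + 1.0) ** 2


if __name__ == "__main__":
    n, a, alpha = 5, 0.9, 0.5
    p, q = optimal_weights(n, a, alpha)
    print(limit_cost(a, alpha, p, q), j_star(n, a, alpha))
    # Any other admissible choice, e.g. equal weights, gives a larger cost.
    print(limit_cost(a, alpha, [0.5] * (n - 1), [0.5] * (n - 1)))
```

Note that the product of the $1/p^*_{j,n}$ telescopes to $(n - 1) a^{-\alpha/2} + 1$, which is one way to see where the closed form of $J^*(a)$ comes from.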


The  following  corollary,  which  states  the  existence  of  nearly  optimal 
importance  sampling  schemes,  is  immediate. 


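Corollary 3.6 Let $\varepsilon > 0$ be given. Then there exists $a \in (0, 1)$ such that
$J^*(a) \le (1 + \varepsilon) n^2$. Let $(p^*_{i,n}, q^*_{i,n})$ be the optimal weights defined as in Lemma
3.5. Then the change of measure $\nu^b$ with parameters $(a, p^*_{i,n}, q^*_{i,n})$ is nearly
asymptotically optimal in that

$$\lim_{b \to \infty} \frac{E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu^b}(X_1, \ldots, X_n) \right]}{\bar F(b)^2} \le (1 + \varepsilon) n^2.$$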
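For a given $\varepsilon$, an admissible $a$ can be read off by solving $J^*(a) = (1 + \varepsilon) n^2$ for $a$; any larger $a < 1$ also works because $J^*$ is decreasing in $a$. The sketch below does this for $n \ge 2$ (for $n = 1$ the cost equals 1 for every $a$). The function name `smallest_a` is illustrative.

```python
import math


def j_star(n, a, alpha):
    """Minimal limit cost ((n - 1) a^(-alpha/2) + 1)^2 from Lemma 3.5."""
    return ((n - 1) * a ** (-alpha / 2.0) + 1.0) ** 2


def smallest_a(n, eps, alpha):
    """Value of a in (0, 1) at which J*(a) equals (1 + eps) * n^2; requires n >= 2."""
    if n < 2:
        raise ValueError("for n = 1 the limit cost is 1 for every a")
    base = (math.sqrt(1.0 + eps) * n - 1.0) / (n - 1)   # required value of a^(-alpha/2)
    return base ** (-2.0 / alpha)


if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, smallest_a(n, 0.1, 0.5))
```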


4  Importance  sampling  for  random  N 

In this section we address the problem of estimating

$$p_b = P(X_1 + X_2 + \cdots + X_N > b)$$

where $N$ is an $\mathbb{N}$-valued random variable that is independent of $\{X_i\}$.
Throughout we assume $E[z^N] < \infty$ for some $z > 1$. Let $s_n = P(N = n)$ and
$c = E[N]$. Observe that $\{n s_n / c\}$ defines a probability measure on $\mathbb{N}$.

Importance sampling algorithm: The scheme is parameterized by $(a_0, a_1, K)$
where $a_0 \in (0, 1)$, $a_1 \in (0, 1 - z^{-1/\alpha})$ and $K \in \mathbb{N}$. Each independent sample
is constructed in the following fashion.

• Generate a random variable $\bar N$ according to $P(\bar N = n) = n s_n / c$.

• If $\bar N = n \le K$, then draw the random vector $(X_1, \ldots, X_n)$ from the
distribution $\nu_n^b$ with parameter $(a_0, p^*_{i,n}, q^*_{i,n})$, where $(p^*_{i,n}, q^*_{i,n})$ are the
optimal weights defined in Lemma 3.5 with $a = a_0$.

• If $\bar N = n > K$, then draw the random vector $(X_1, \ldots, X_n)$ from the
distribution $\nu_n^b$ with parameter $(a_1, p^*_{i,n}, q^*_{i,n})$, where $(p^*_{i,n}, q^*_{i,n})$ are the
optimal weights defined in Lemma 3.5 with $a = a_1$.

• Define

$$\hat p_b = \frac{c}{\bar N} 1_{\{S_{\bar N} > b\}} \frac{d\mu}{d\nu_{\bar N}^b}(X_1, \ldots, X_{\bar N}).$$

The importance sampling estimator is just the sample average of independent
copies of $\hat p_b$.
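The first step of the algorithm, sampling the number of summands from the size-biased law $n s_n / c$, can be sketched as follows for the geometric distribution $s_n = \rho(1 - \rho)^{n-1}$ used in Section 5. This is an illustration under that assumption only; the function names are hypothetical. For a geometric $N$ the size-biased law coincides with the law of the sum of two independent geometric variables minus one, which gives a simple sampler.

```python
import math
import random


def size_biased_pmf(n, rho):
    """P(N_bar = n) = n * s_n / c with s_n = rho (1 - rho)^(n - 1) and c = 1 / rho."""
    return n * rho ** 2 * (1.0 - rho) ** (n - 1)


def sample_geometric(rho, rng):
    """Geometric variable on {1, 2, ...} with success probability rho, by inversion."""
    u = 1.0 - rng.random()                      # u lies in (0, 1]
    return int(math.floor(math.log(u) / math.log(1.0 - rho))) + 1


def sample_size_biased(rho, rng):
    """Draw N_bar: the sum of two independent geometrics minus one."""
    return sample_geometric(rho, rng) + sample_geometric(rho, rng) - 1


def count_weight(n, rho):
    """Likelihood ratio s_n / (n s_n / c) = c / n that multiplies the estimator."""
    return (1.0 / rho) / n


if __name__ == "__main__":
    rng = random.Random(2)
    draws = [sample_size_biased(0.25, rng) for _ in range(100000)]
    # The average of c / N_bar should be close to 1, since sum_n s_n = 1.
    print(sum(count_weight(n, 0.25) for n in draws) / len(draws))
```

Given a draw $\bar N = n$, the summands would then be generated from $\nu_n^b$ with $a = a_0$ or $a = a_1$ depending on whether $n \le K$, and the resulting likelihood ratio multiplied by `count_weight(n, rho)`.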


The  following  result  characterizes  the  asymptotic  performance  of  this 
importance  sampling  scheme. 

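Theorem 4.1 Consider the importance sampling scheme with parameter
$(a_0, a_1, K)$. Then

$$\lim_{b \to \infty} \frac{E[\hat p_b^2]}{\bar F(b)^2}
= \sum_{n=1}^{K} \frac{c s_n}{n} \left( (n - 1) a_0^{-\alpha/2} + 1 \right)^2
+ \sum_{n=K+1}^{\infty} \frac{c s_n}{n} \left( (n - 1) a_1^{-\alpha/2} + 1 \right)^2. \tag{4.1}$$

In particular, for any $\varepsilon > 0$, there exist $(a_0, a_1, K)$ such that

$$\lim_{b \to \infty} \frac{E[\hat p_b^2]}{\bar F(b)^2} \le (1 + \varepsilon) c^2.$$

Before proceeding with the proof, let us check that this indeed describes
a nearly asymptotically optimal scheme. By Jensen's inequality

$$E[\hat p_b^2] \ge (E[\hat p_b])^2 = P(S_N > b)^2.$$

Also, since the random variables $X_i$ are subexponential and $E[z^N] < \infty$ for
some $z > 1$, [1, page 259, Lemma 2.2] asserts that

$$\lim_{b \to \infty} \frac{P(S_N > b)}{\bar F(b)} = E[N] = c.$$

It follows that

$$\liminf_{b \to \infty} \frac{E[\hat p_b^2]}{\bar F(b)^2} \ge c^2.$$

Hence such a scheme is indeed nearly asymptotically optimal.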


Remark 4.1 As we will see, the introduction of the cutoff $K$ and the use
of a different parameter $a_1$ for $N > K$ are for technical reasons in order to
facilitate an interchange needed in the proof. It is not known at this time if
this setup is necessary, or if one can work with a single parameter $a_0 \in (0, 1)$
and $K = \infty$.


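Proof of Theorem 4.1. When the samples are generated according to this
scheme,

$$\frac{1}{\bar F(b)^2} E[\hat p_b^2]
= \frac{1}{\bar F(b)^2} E\left[ \frac{c}{N} 1_{\{S_N > b\}} \frac{d\mu}{d\nu_N^b}(X_1, \ldots, X_N) \right]$$

$$= \sum_{n=1}^{K} \frac{c s_n}{n} \frac{1}{\bar F(b)^2} E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu_n^b}(X_1, \ldots, X_n) \right]
+ \sum_{n=K+1}^{\infty} \frac{c s_n}{n} \frac{1}{\bar F(b)^2} E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu_n^b}(X_1, \ldots, X_n) \right].$$

We next take $b$ to $\infty$ in the previous display. Assume for now that the
interchange of limit and the infinite sum is valid; the justification will be
given momentarily. Then (4.1) follows immediately from Proposition 3.1
and Lemma 3.5. Since $a_0, a_1 < 1$ and $\sum_{n=1}^{\infty} n s_n = c$, it is not difficult to see
that the right-hand-side of (4.1) is bounded from above by

$$\sum_{n=1}^{\infty} \frac{c s_n}{n} n^2 a_0^{-\alpha} + \sum_{n=K+1}^{\infty} \frac{c s_n}{n} n^2 a_1^{-\alpha}
= a_0^{-\alpha} c^2 + a_1^{-\alpha} c \sum_{n=K+1}^{\infty} n s_n. \tag{4.2}$$

For any $\varepsilon > 0$, the conclusion of the theorem follows by taking $K$ large
enough and $a_0$ sufficiently close to 1.

It remains to justify the interchange of limit with the infinite sum. This
will be done by finding a dominating function for

$$\frac{1}{\bar F(b)^2} \frac{c s_n}{n} E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu_n^b}(X_1, \ldots, X_n) \right]$$

when $n > K$. Recall that in this case $\nu_n^b$ is defined with parameter
$(a_1, p^*_{i,n}, q^*_{i,n})$, where $(p^*_{i,n}, q^*_{i,n})$ are the optimal weights given by Lemma 3.5
with $a = a_1$. By inequality (3.7) we have

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu_n^b}(x) \le \frac{1}{q^2} \prod_{j=1}^{n-1} \frac{1}{p^*_{j,n}} \cdot \frac{\bar F(b a_1 (1 - a_1)^{n-1})}{\bar F(b)}$$

on the set $\{x \in \mathbb{R}_+^n : x_1 + \cdots + x_n > b\}$, where $q$ [defined in (3.3)] is

$$q = \min_j \{q^*_{j,n}\} = \min_j \{1 - p^*_{j,n}\} = \frac{a_1^{-\alpha/2}}{(n - 1) a_1^{-\alpha/2} + 1} \le 1.$$

Using this and the particular form of the weights $p^*_{j,n}$ from Lemma 3.5,

$$\frac{1}{\bar F(b)} \frac{d\mu}{d\nu_n^b}(x) \le \frac{\bar F(b a_1 (1 - a_1)^{n-1})}{\bar F(b)} \left( (n - 1) a_1^{-\alpha/2} + 1 \right)^3 a_1^{\alpha},$$

which in turn implies

$$\frac{1}{\bar F(b)^2} E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu_n^b}(X_1, \ldots, X_n) \right]
\le \frac{\bar F(b a_1 (1 - a_1)^{n-1})}{\bar F(b)} \cdot \frac{P(S_n > b)}{\bar F(b)} \left( (n - 1) a_1^{-\alpha/2} + 1 \right)^3 a_1^{\alpha}. \tag{4.3}$$

A well known result from the theory of subexponential distributions (see,
e.g., [1, page 255, Lemma 1.8]) states that for all $\gamma > 0$ there is $K(\gamma)$ such
that the following bound holds for all $b > 0$:

$$\frac{P(S_n > b)}{\bar F(b)} \le K(\gamma) (1 + \gamma)^n. \tag{4.4}$$

Another result [4, page 25, Theorem 1.5.6] states the following: for any $\delta > 0$
there exists $A(\delta) > 1$ such that for all $0 < y \le x$,

$$\frac{\bar F(y)}{\bar F(x)} \le A(\delta) \left( \frac{x}{y} \right)^{\delta + \alpha}. \tag{4.5}$$

Now choose $\gamma, \delta > 0$ so that

$$\frac{1 + \gamma}{(1 - a_1)^{\alpha + \delta} z} < 1.$$

Such $\gamma$ and $\delta$ always exist, thanks to the assumption that
$0 < a_1 < 1 - z^{-1/\alpha}$. We now apply the bounds in equations (4.4) and (4.5) to
inequality (4.3). Observing that $s_n \le C z^{-n}$ for some constant $C$ since
$E[z^N] < \infty$, it is not difficult to show that there is a finite constant $\bar C$ such that

$$\frac{1}{\bar F(b)^2} \frac{c s_n}{n} E\left[ 1_{\{S_n > b\}} \frac{d\mu}{d\nu_n^b}(X_1, \ldots, X_n) \right] \le \bar C n^2 \beta^n,$$

where $\beta = (1 + \gamma)(1 - a_1)^{-(\alpha + \delta)} / z < 1$. The right-hand-side then serves as
a summable dominating function. ■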


5  Numerical  Results 


In this section we present some numerical results for the estimation of

$$p_b = P(X_1 + \cdots + X_N > b),$$

where the $X_i$'s are iid random variables with regularly varying tails. The
simulation results from the algorithms outlined in this paper are denoted by
DIS (for dynamic importance sampling). For comparison, we also include
results from the weighted delayed hazard twisting algorithm of [9], denoted
by WDHT, and the conditional Monte Carlo algorithm from the report [2],
marked as CMC.

In all the tables, $N$ is a random variable independent of $\{X_i\}$ with
distribution $P(N = n) = \rho(1 - \rho)^{n-1}$ for $n \ge 1$. In Tables 1 and 2, we assume $X_i$
has  tail  distribution  P{Xi  >  b)  =  (1  +  6)“"  for  various  values  of  a,  while  Ta¬ 
ble  5  uses  tail  distribution  of  form  P{Xi  >  6)  =  (1+6/2)“^-^^ log(2+6)/log2. 

For the WDHT algorithm we use the parameters used in the paper [9].
For our algorithm (described in Section 4), we must choose $(a_0, a_1, K)$. For
a given $\varepsilon > 0$, we set

$$a_0 = \left( 1 + \frac{\varepsilon}{2} \right)^{-1/\alpha}, \qquad a_1 = \frac{1 - (1 - \rho)^{1/\alpha}}{2},$$

and

$$K = \lfloor \max\{-\delta \log A, 2\delta^2\} + 1 \rfloor, \quad \text{where } \delta = -\frac{1}{\log \sqrt{1 - \rho}} \text{ and } A = \frac{\varepsilon a_1^{\alpha}}{2(1 + \rho)}.$$

Note that under this choice of $(a_0, a_1, K)$, the conditions of Theorem 4.1
are satisfied and the right-hand-side of (4.1) is bounded from above by
$(1 + \varepsilon) E[N]^2$ (see the Appendix), so that the scheme is nearly asymptotically
optimal. This is not the only choice that has such properties. The
performance, however, does not vary much when using other choices.
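The parameter choice above is easy to evaluate numerically. The sketch below computes $(a_0, a_1, K)$ for a geometric $N$ and then evaluates a truncated version of the right-hand-side of (4.1) to compare it with $(1 + \varepsilon) E[N]^2$. The function names and the truncation length `terms` are illustrative choices, not part of the paper.

```python
import math


def dis_parameters(eps, alpha, rho):
    """(a_0, a_1, K) for P(N = n) = rho (1 - rho)^(n - 1), following Section 5."""
    a0 = (1.0 + eps / 2.0) ** (-1.0 / alpha)
    a1 = (1.0 - (1.0 - rho) ** (1.0 / alpha)) / 2.0
    delta = -1.0 / math.log(math.sqrt(1.0 - rho))
    A = eps * a1 ** alpha / (2.0 * (1.0 + rho))
    K = math.floor(max(-delta * math.log(A), 2.0 * delta ** 2) + 1.0)
    return a0, a1, K


def limit_second_moment(a0, a1, K, alpha, rho, terms=5000):
    """Right-hand-side of (4.1), with the infinite sum truncated after K + terms."""
    c = 1.0 / rho
    total = 0.0
    for n in range(1, K + terms + 1):
        s_n = rho * (1.0 - rho) ** (n - 1)
        a = a0 if n <= K else a1
        total += c * s_n / n * ((n - 1) * a ** (-alpha / 2.0) + 1.0) ** 2
    return total


if __name__ == "__main__":
    eps, alpha, rho = 0.1, 0.5, 0.25
    a0, a1, K = dis_parameters(eps, alpha, rho)
    rhs = limit_second_moment(a0, a1, K, alpha, rho)
    print(a0, a1, K, rhs, (1.0 + eps) / rho ** 2)
```

The truncation is harmless for geometric $N$ because the neglected terms decay geometrically.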

It has been shown that the WDHT algorithm is logarithmically asymptotically
optimal (which means that the log of the second moment divided
by the log of the probability of interest converges to 2 as $b \to \infty$), and
that the CMC algorithm has bounded relative error (though not necessarily
nearly asymptotically optimal relative error). The numerical results show
that our algorithm has the best performance for all the parameter values
considered, with the standard error better than that in the CMC algorithm
by at least a factor of 10.

Remark 5.1 It is not standard in the literature on this topic to report
simulation results for deterministic $N$. However, we did test such problems,
and in some cases found that the performance of our algorithm and
CMC was similar. We conjecture that for these cases the CMC algorithm
is actually nearly asymptotically optimal, even though there is no proof to
support this conjecture. In all cases that we tested where $N$ was random
the two algorithms were not comparable, with differences similar to those
in the presented examples, and thus for random $N$ it seems that the CMC
algorithm is not nearly asymptotically optimal. In all cases, both algorithms
out-performed the WDHT algorithm.


Appendix 

In this appendix we show that for the choice of $(a_0, a_1, K)$ defined in Section
5, the right-hand side of (4.1) is bounded from above by $(1+\varepsilon)E[N]^2$. Note
that (4.2) gives an upper bound for the right-hand side of (4.1). It suffices
to show that for this choice of $(a_0, a_1, K)$,

$$a_0^{\text{[illegible]}} c^2 + a_1^{-\alpha} c \sum_{n=K+1}^{\infty} n s_n < (1+\varepsilon) c^2,$$

which is itself implied by

$$\sum_{n=K+1}^{\infty} n s_n / c < \frac{\varepsilon a_1^{\alpha}}{2}.$$

Since $s_n = P(N = n) = p(1-p)^{n-1}$ and $c = E[N] = 1/p$, we need to show

$$\sum_{n=K+1}^{\infty} n p^2 (1-p)^{n-1} < \frac{\varepsilon a_1^{\alpha}}{2}.$$

But simple algebra yields

$$\sum_{n=K+1}^{\infty} n p^2 (1-p)^{n-1} = (Kp+1)(1-p)^K \le K(p+1)(1-p)^K,$$

whence it remains to show

$$K(1-p)^K < A.$$

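To this end, observing that $K > 2\delta^2$ and using the inequality $e^x \ge x^2/2$ for $x \ge 0$, we
have

$$(1-p)^{-K/2} = e^{K/\delta} \ge \frac{K^2}{2\delta^2} > K.$$

Therefore

$$K(1-p)^K < (1-p)^{K/2} = \left(\sqrt{1-p}\right)^K = e^{-K/\delta}.$$

This completes the proof since $K > -\delta \log A$. ■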
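
The two computational steps of the proof can be checked numerically. The following is a minimal sketch, assuming the closed form $(Kp+1)(1-p)^K$ for the tail sum and $\delta = -1/\log\sqrt{1-p}$; the function names are hypothetical, and the infinite sum is truncated after a fixed number of terms.

```python
import math


def tail_sum(p, K, terms=400):
    """Truncated value of sum_{n=K+1}^infinity n p^2 (1-p)^(n-1)."""
    return sum(n * p * p * (1.0 - p) ** (n - 1)
               for n in range(K + 1, K + 1 + terms))


def closed_form(p, K):
    """Closed form (K p + 1)(1 - p)^K of the same tail sum."""
    return (K * p + 1.0) * (1.0 - p) ** K


def chain_holds(p, K):
    """Check e^{K/delta} > K and K (1-p)^K < e^{-K/delta} for K > 2 delta^2."""
    delta = -1.0 / math.log(math.sqrt(1.0 - p))
    if K <= 2.0 * delta ** 2:
        raise ValueError("K must exceed 2 * delta^2")
    return (math.exp(K / delta) > K
            and K * (1.0 - p) ** K < math.exp(-K / delta))


if __name__ == "__main__":
    for p, K in ((0.25, 97), (0.5, 17), (0.75, 5)):
        print(p, K, tail_sum(p, K), closed_form(p, K), chain_holds(p, K))
```

The values of $K$ in the usage loop are simply the smallest integers above $2\delta^2$ for each $p$; any larger $K$ works as well.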
p     b      True Value  Method  Estimate   Std. Error  Confid. Interval
----  -----  ----------  ------  ---------  ----------  ----------------------
0.25  1e+06  0.004       DIS     0.003997   2.037e-06   [0.003993, 0.004001]
0.25  1e+06  0.004       WDHT    0.004018   0.0001703   [0.003678, 0.004359]
0.25  1e+06  0.004       CMC     0.003954   2.427e-05   [0.003906, 0.004003]
0.25  1e+12  4e-06       DIS     3.997e-06  1.879e-09   [3.994e-06, 4.001e-06]
0.25  1e+12  4e-06       WDHT    4.253e-06  2.902e-07   [3.673e-06, 4.833e-06]
0.25  1e+12  4e-06       CMC     4.016e-06  2.44e-08    [3.967e-06, 4.064e-06]
0.25  1e+18  4e-09       DIS     3.999e-09  1.804e-12   [3.995e-09, 4.002e-09]
0.25  1e+18  4e-09       WDHT    3.748e-09  3.549e-10   [3.038e-09, 4.458e-09]
0.25  1e+18  4e-09       CMC     4.041e-09  2.469e-11   [3.992e-09, 4.091e-09]
0.5   1e+06  0.002       DIS     0.002001   6.507e-07   [0.001999, 0.002002]
0.5   1e+06  0.002       WDHT    0.001948   7.946e-05   [0.001789, 0.002107]
0.5   1e+06  0.002       CMC     0.00199    1.002e-05   [0.00197, 0.00201]
0.5   1e+12  2e-06       DIS     2e-06      6.897e-10   [1.999e-06, 2.002e-06]
0.5   1e+12  2e-06       WDHT    1.941e-06  1.256e-07   [1.69e-06, 2.192e-06]
0.5   1e+12  2e-06       CMC     1.993e-06  9.876e-09   [1.973e-06, 2.012e-06]
0.5   1e+18  2e-09       DIS     2e-09      7.041e-13   [1.999e-09, 2.001e-09]
0.5   1e+18  2e-09       WDHT    1.887e-09  1.644e-10   [1.558e-09, 2.216e-09]
0.5   1e+18  2e-09       CMC     1.999e-09  9.888e-12   [1.98e-09, 2.019e-09]
0.75  1e+06  0.001333    DIS     0.001333   3.643e-07   [0.001332, 0.001334]
0.75  1e+06  0.001333    WDHT    0.001372   4.821e-05   [0.001275, 0.001468]
0.75  1e+06  0.001333    CMC     0.001335   4.698e-06   [0.001326, 0.001345]
0.75  1e+12  1.33e-06    DIS     1.333e-06  3.272e-10   [1.333e-06, 1.334e-06]
0.75  1e+12  1.33e-06    WDHT    1.332e-06  7.492e-08   [1.182e-06, 1.481e-06]
0.75  1e+12  1.33e-06    CMC     1.343e-06  4.796e-09   [1.334e-06, 1.353e-06]
0.75  1e+18  1.33e-09    DIS     1.334e-09  3.061e-13   [1.333e-09, 1.334e-09]
0.75  1e+18  1.33e-09    WDHT    1.517e-09  1.085e-10   [1.3e-09, 1.734e-09]
0.75  1e+18  1.33e-09    CMC     1.334e-09  4.681e-12   [1.325e-09, 1.344e-09]

Table 1. Estimates for $P(X_1 + \cdots + X_N > b)$ where $P(N = n) = p(1-p)^{n-1}$ and $P(X_i > b) = (1+b)^{-1/2}$. All the results
use 20,000 iterations. The true value is obtained by running our algorithm and the CMC algorithm for 500,000 iterations.
The tolerance for our algorithm is set to $\varepsilon = 0.01$.
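
The reported intervals and standard errors can be related with a short computation. The sketch below uses the Table 1 entries for p = 0.25, b = 1e+06 and assumes the confidence intervals are of the form estimate ± 2 standard errors; that width is inferred from the reported numbers and is not stated in the text. The helper name `summarize` is hypothetical.

```python
def summarize(estimate, std_error, width=2.0):
    """Return the relative standard error and estimate +/- width * std_error."""
    half = width * std_error
    return std_error / estimate, (estimate - half, estimate + half)


# (estimate, standard error) for p = 0.25, b = 1e+06, copied from Table 1.
ROW = {
    "DIS": (0.003997, 2.037e-06),
    "WDHT": (0.004018, 0.0001703),
    "CMC": (0.003954, 2.427e-05),
}

if __name__ == "__main__":
    for name, (est, se) in ROW.items():
        rel, (lo, hi) = summarize(est, se)
        print(f"{name:5s} rel. std. error = {rel:.2e}, "
              f"interval = [{lo:.6f}, {hi:.6f}]")
    print("CMC / DIS std. error ratio:", ROW["CMC"][1] / ROW["DIS"][1])
```

The ratio of the CMC to the DIS standard error is the quantity behind the comparison made in Section 5.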


p     b      True Value  Method  Estimate   Std. Error  Confid. Interval
----  -----  ----------  ------  ---------  ----------  ----------------------
0.25  1000   0.0001286   DIS     0.0001279  9.87e-08    [0.0001277, 0.0001281]
0.25  1000   0.0001286   WDHT    0.0001297  7.482e-06   [0.0001148, 0.0001447]
0.25  1000   0.0001286   CMC     0.0001289  8.049e-07   [0.0001273, 0.0001305]
0.25  1e+05  1.266e-07   DIS     1.266e-07  5.337e-11   [1.264e-07, 1.267e-07]
0.25  1e+05  1.266e-07   WDHT    1.407e-07  1.092e-08   [1.189e-07, 1.626e-07]
0.25  1e+05  1.266e-07   CMC     1.255e-07  7.671e-10   [1.239e-07, 1.27e-07]
0.25  1e+08  4e-12       DIS     4.003e-12  1.527e-15   [4e-12, 4.006e-12]
0.25  1e+08  4e-12       WDHT    3.938e-12  4.288e-13   [3.081e-12, 4.796e-12]
0.25  1e+08  4e-12       CMC     3.995e-12  2.443e-14   [3.946e-12, 4.043e-12]
0.5   1000   6.352e-05   DIS     6.341e-05  1.84e-08    [6.337e-05, 6.345e-05]
0.5   1000   6.352e-05   WDHT    6.286e-05  3.429e-06   [5.601e-05, 6.972e-05]
0.5   1000   6.352e-05   CMC     6.35e-05   3.206e-07   [6.285e-05, 6.414e-05]
0.5   1e+05  6.329e-08   DIS     6.324e-08  2.26e-11    [6.32e-08, 6.329e-08]
0.5   1e+05  6.329e-08   WDHT    5.864e-08  4.629e-09   [4.938e-08, 6.79e-08]
0.5   1e+05  6.329e-08   CMC     6.352e-08  3.199e-10   [6.288e-08, 6.416e-08]
0.5   1e+08  2e-12       DIS     2.001e-12  6.098e-16   [2e-12, 2.002e-12]
0.5   1e+08  2e-12       WDHT    1.745e-12  1.864e-13   [1.372e-12, 2.118e-12]
0.5   1e+08  2e-12       CMC     2.005e-12  1.002e-14   [1.985e-12, 2.025e-12]
0.75  1000   4.218e-05   DIS     4.217e-05  7.219e-09   [4.215e-05, 4.218e-05]
0.75  1000   4.218e-05   WDHT    4.02e-05   1.981e-06   [3.624e-05, 4.416e-05]
0.75  1000   4.218e-05   CMC     4.231e-05  1.506e-07   [4.201e-05, 4.261e-05]
0.75  1e+05  4.218e-08   DIS     4.217e-08  9.719e-12   [4.215e-08, 4.219e-08]
0.75  1e+05  4.218e-08   WDHT    4.154e-08  2.674e-09   [3.619e-08, 4.689e-08]
0.75  1e+05  4.218e-08   CMC     4.207e-08  1.479e-10   [4.177e-08, 4.236e-08]
0.75  1e+08  1.33e-12    DIS     1.334e-12  3.061e-16   [1.333e-12, 1.334e-12]
0.75  1e+08  1.33e-12    WDHT    1.376e-12  1.134e-13   [1.149e-12, 1.603e-12]
0.75  1e+08  1.33e-12    CMC     1.333e-12  4.69e-15    [1.324e-12, 1.343e-12]

Table 2. Estimates for $P(X_1 + \cdots + X_N > b)$ where $P(N = n) = p(1-p)^{n-1}$ and $P(X_i > b) = (1+b)^{-3/2}$. All the results
use 20,000 iterations. The true value is obtained by running our algorithm and the CMC algorithm for 500,000 iterations.
The tolerance for our algorithm is set to $\varepsilon = 0.01$.
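
For orientation, the model behind Table 2 can be simulated directly. The sketch below is a standard (crude) Monte Carlo estimator, not one of the three algorithms compared in the tables; it assumes the tail $P(X_i > x) = (1+x)^{-\alpha}$ with $\alpha = 3/2$ and geometric $N$, and all function names are hypothetical. With 20,000 iterations and probabilities of order 1e-07 or smaller it will typically return zero hits, which is the reason importance sampling is needed.

```python
import math
import random


def sample_geometric(p, rng):
    """Draw N with P(N = n) = p (1 - p)^(n - 1), n = 1, 2, ..."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return int(math.log(u) / math.log(1.0 - p)) + 1


def sample_x(alpha, rng):
    """Draw X with P(X > x) = (1 + x)^(-alpha) by inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return u ** (-1.0 / alpha) - 1.0


def crude_estimate(p, alpha, b, iterations, seed=0):
    """Standard Monte Carlo estimate of P(X_1 + ... + X_N > b)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        n = sample_geometric(p, rng)
        s = sum(sample_x(alpha, rng) for _ in range(n))
        hits += s > b
    p_hat = hits / iterations
    std_error = math.sqrt(p_hat * (1.0 - p_hat) / iterations)
    return p_hat, std_error


if __name__ == "__main__":
    print(crude_estimate(p=0.25, alpha=1.5, b=1000.0, iterations=20_000))
```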


p     b      True Value  Method  Estimate   Std. Error  Confid. Interval
----  -----  ----------  ------  ---------  ----------  ----------------------
0.25  1e+04  5.951e-07   DIS     5.939e-07  2.756e-11   [5.939e-07, 5.94e-07]
0.25  1e+04  5.951e-07   CMC     5.935e-07  3.665e-09   [5.862e-07, 6.009e-07]
0.25  1e+07  3.678e-13   DIS     3.68e-13   4.075e-17   [3.679e-13, 3.681e-13]
0.25  1e+07  3.678e-13   CMC     3.684e-13  2.261e-15   [3.639e-13, 3.729e-13]
0.25  1e+09  2.371e-17   DIS     2.371e-17  3.749e-21   [2.37e-17, 2.371e-17]
0.25  1e+09  2.371e-17   CMC     2.359e-17  1.449e-19   [2.33e-17, 2.388e-17]
0.5   1e+04  2.965e-07   DIS     2.964e-07  1.739e-11   [2.964e-07, 2.965e-07]
0.5   1e+04  2.965e-07   CMC     2.99e-07   1.505e-09   [2.96e-07, 3.02e-07]
0.5   1e+07  1.84e-13    DIS     1.84e-13   1.84e-17    [1.839e-13, 1.84e-13]
0.5   1e+07  1.84e-13    CMC     1.83e-13   9.116e-16   [1.812e-13, 1.849e-13]
0.5   1e+09  1.186e-17   DIS     1.185e-17  1.186e-21   [1.185e-17, 1.186e-17]
0.5   1e+09  1.186e-17   CMC     1.19e-17   5.899e-20   [1.178e-17, 1.201e-17]
0.75  1e+04  1.976e-07   DIS     1.975e-07  5.908e-12   [1.975e-07, 1.975e-07]
0.75  1e+04  1.976e-07   CMC     1.974e-07  7.05e-10    [1.96e-07, 1.988e-07]
0.75  1e+07  1.227e-13   DIS     1.227e-13  6.134e-18   [1.226e-13, 1.227e-13]
0.75  1e+07  1.227e-13   CMC     1.226e-13  4.315e-16   [1.217e-13, 1.235e-13]
0.75  1e+09  7.9e-18     DIS     7.903e-18  3.953e-22   [7.903e-18, 7.904e-18]
0.75  1e+09  7.9e-18     CMC     7.938e-18  2.81e-20    [7.881e-18, 7.994e-18]

Table 3. Estimates for $P(X_1 + \cdots + X_N > b)$ where $P(N = n) = p(1-p)^{n-1}$ and $P(X_i > b) = (1 + b/2)^{-2.15} \log(2+b)/\log 2$.
All the results use 20,000 iterations. The true value is obtained by running our algorithm and the CMC algorithm for 500,000
iterations. The tolerance for our algorithm is set to $\varepsilon = 0.01$.
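
As a sanity check on the "True Value" column, one can compare it with the standard first-order approximation $E[N]\,P(X_1 > b)$ for sums with regularly varying tails. The sketch below assumes the tail $(1 + b/2)^{-2.15}\log(2+b)/\log 2$, where the exponent 2.15 is a reading inferred from the table values rather than a confirmed constant; the function names are hypothetical.

```python
import math


def tail_x(b, exponent=2.15):
    """Assumed tail P(X_1 > b) = (1 + b/2)^(-exponent) * log(2 + b) / log 2."""
    return (1.0 + b / 2.0) ** (-exponent) * math.log(2.0 + b) / math.log(2.0)


def first_order(p, b):
    """Approximation E[N] * P(X_1 > b), with E[N] = 1/p for geometric N."""
    return tail_x(b) / p


# "True Value" entries copied from Table 3, keyed by (p, b).
TABLE3_TRUE = {
    (0.25, 1e4): 5.951e-07, (0.25, 1e7): 3.678e-13, (0.25, 1e9): 2.371e-17,
    (0.5, 1e4): 2.965e-07, (0.5, 1e7): 1.84e-13, (0.5, 1e9): 1.186e-17,
    (0.75, 1e4): 1.976e-07, (0.75, 1e7): 1.227e-13, (0.75, 1e9): 7.9e-18,
}


def relative_gaps():
    """Relative difference between the approximation and each table entry."""
    return {key: first_order(*key) / true - 1.0
            for key, true in TABLE3_TRUE.items()}


if __name__ == "__main__":
    for key, gap in relative_gaps().items():
        print(key, f"{gap:+.3%}")
```

The approximation is only asymptotic in $b$, so the agreement is expected to be closest in the rows with the largest thresholds.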